Results 1 - 11 of 11
1.
Article in English | MEDLINE | ID: mdl-37665699

ABSTRACT

Monitoring the healthy development of a fetus requires accurate and timely identification of different maternal-fetal structures as they grow. To facilitate this objective in an automated fashion, we propose a deep-learning-based image classification architecture called COMFormer to classify maternal-fetal and brain anatomical structures present in 2-D fetal ultrasound (US) images. The proposed architecture classifies the two subcategories separately: maternal-fetal (abdomen, brain, femur, thorax, mother's cervix (MC), and others) and brain anatomical structures (trans-thalamic (TT), trans-cerebellum (TC), trans-ventricular (TV), and non-brain (NB)). Our architecture relies on a transformer-based approach that leverages spatial and global features using a newly designed residual cross-variance attention block. This block introduces an advanced cross-covariance attention (XCA) mechanism to capture a long-range representation from the input using spatial (e.g., shape, texture, intensity) and global features. To build COMFormer, we used a large publicly available dataset (BCNatal) consisting of 12,400 images from 1792 subjects. Experimental results show that COMFormer outperforms recent CNN- and transformer-based models, achieving 95.64% and 96.33% classification accuracy on maternal-fetal and brain anatomy, respectively.
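The residual cross-variance attention block builds on cross-covariance attention (XCA), in which attention is computed between feature channels rather than between spatial tokens, so its cost scales with the channel count rather than the number of image patches. Below is a minimal PyTorch sketch of such an attention block; the layer names, head count, and the omission of residual and feed-forward wiring are illustrative assumptions, not the published COMFormer code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class XCAttention(nn.Module):
    """Cross-covariance attention: channel-to-channel attention over patch tokens."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):                                    # x: (B, N, C) patch tokens
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 4, 1)                 # each: (B, heads, head_dim, N)
        q, k = F.normalize(q, dim=-1), F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature  # (B, heads, head_dim, head_dim)
        attn = attn.softmax(dim=-1)
        out = (attn @ v).permute(0, 3, 1, 2).reshape(B, N, C)
        return self.proj(out)                                # same token layout as the input
```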


Subject(s)
Brain, Prenatal Ultrasonography, Female, Pregnancy, Humans, Brain/diagnostic imaging, Ultrasonography, Electric Power Supplies, Femur
2.
Med Image Anal; 82: 102630, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36223683

ABSTRACT

In this work, we present a novel gaze-assisted natural language processing (NLP)-based video captioning model to describe routine second-trimester fetal ultrasound scan videos in the vocabulary of spoken sonography. The primary novelty of our multi-modal approach is that the learned video captioning model is built using a combination of ultrasound video, tracked gaze, and textual transcriptions of speech recordings. The textual captions that describe the spatio-temporal scan video content are learnt from sonographer speech recordings. Caption generation is assisted by sonographer gaze-tracking information reflecting their visual attention while performing live imaging and interpreting a frozen image. To evaluate the effect of adding, or withholding, different forms of gaze on the video model, we compare spatio-temporal deep networks trained using three multi-modal configurations, namely: (1) a gaze-less neural network with only text and video as input, (2) a neural network additionally using real sonographer gaze in the form of attention maps, and (3) a neural network using automatically predicted gaze in the form of saliency maps instead. We assess algorithm performance through established general text-based metrics (BLEU, ROUGE-L, F1 score), a domain-specific metric (ARS), and metrics that consider the richness and efficiency of the generated captions with respect to the scan video. Results show that the proposed gaze-assisted models can generate richer and more diverse captions for clinical fetal ultrasound scan videos than those without gaze, at the expense of perceived sentence structure. The results also show that the generated captions are similar to sonographer speech in terms of discussing the visual content and the scanning actions performed.
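One way to read configuration (2) is that real sonographer gaze enters the model as a spatial attention map over each frame. The sketch below (a hypothetical helper, not the paper's implementation) shows how a gaze heat map could re-weight CNN frame features before pooling, assuming the map is resized and normalised to a spatial distribution.

```python
import torch
import torch.nn.functional as F

def gaze_weighted_pooling(frame_feats, gaze_map):
    """frame_feats: (B, C, H, W) CNN features; gaze_map: (B, 1, H0, W0) gaze heat map."""
    gaze = F.interpolate(gaze_map, size=frame_feats.shape[-2:], mode="bilinear",
                         align_corners=False)
    gaze = gaze / (gaze.sum(dim=(-2, -1), keepdim=True) + 1e-8)  # normalise to a distribution
    return (frame_feats * gaze).sum(dim=(-2, -1))                # (B, C) gaze-pooled feature
```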


Subject(s)
Algorithms, Neural Networks (Computer), Humans, Pregnancy, Female, Prenatal Ultrasonography
3.
Diagnostics (Basel); 13(1), 2022 Dec 29.
Article in English | MEDLINE | ID: mdl-36611396

ABSTRACT

Medical image analysis methods for mammograms, ultrasound, and magnetic resonance imaging (MRI) cannot provide the underlying features at the cellular level needed to understand the cancer microenvironment, which makes them unsuitable for breast cancer subtype classification. In this paper, we propose a convolutional neural network (CNN)-based breast cancer classification method for hematoxylin and eosin (H&E) whole slide images (WSIs). The proposed method incorporates fused mobile inverted bottleneck convolutions (FMB-Conv) and mobile inverted bottleneck convolutions (MBConv) with a dual squeeze and excitation (DSE) network to accurately classify breast cancer tissue into binary (benign and malignant) and eight subtype classes using histopathology images. To this end, a pre-trained EfficientNetV2 network is used as a backbone with a modified DSE block that combines spatial and channel-wise squeeze and excitation layers to highlight important low-level and high-level abstract features. Our method outperformed the ResNet101, InceptionResNetV2, and EfficientNetV2 networks on the publicly available BreakHis dataset for binary and multi-class breast cancer classification in terms of precision, recall, and F1-score across multiple magnification levels.
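The dual squeeze-and-excitation idea pairs the usual channel-wise recalibration with a spatial gate. A hedged PyTorch sketch of one plausible DSE block follows; the reduction ratio and the additive fusion of the two branches are assumptions, and the paper's exact block may differ.

```python
import torch
import torch.nn as nn

class DualSE(nn.Module):
    """Combine channel-wise and spatial squeeze-and-excitation on a feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.channel_se = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1), nn.Sigmoid())
        self.spatial_se = nn.Sequential(nn.Conv2d(channels, 1, 1), nn.Sigmoid())

    def forward(self, x):                                   # x: (B, C, H, W)
        return x * self.channel_se(x) + x * self.spatial_se(x)  # fuse the two recalibrations
```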

4.
Med Image Underst Anal (2022); 13413: 187-198, 2022 Jul.
Article in English | MEDLINE | ID: mdl-36848308

ABSTRACT

Medical image captioning models generate text to describe the semantic content of an image, aiding non-experts in understanding and interpretation. We propose a weakly supervised approach to improve the performance of image captioning models on small image-text datasets by leveraging a large anatomically-labelled image classification dataset. Our method generates pseudo-captions (weak labels) for caption-less but anatomically-labelled (class-labelled) images using an encoder-decoder sequence-to-sequence model. The augmented dataset is then used to train an image captioning model in a weakly supervised manner. For fetal ultrasound, we demonstrate that the proposed augmentation approach outperforms the baseline on semantics- and syntax-based metrics, with nearly twice the improvement on BLEU-1 and ROUGE-L. Moreover, the proposed data augmentation yields superior models compared with existing regularization techniques. This work allows seamless automatic annotation of images that lack human-prepared descriptive captions for training image captioning models. Using pseudo-captions in the training data is particularly useful for medical image captioning, where obtaining real image captions requires significant time and effort from medical experts.
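In outline, the augmentation step maps each anatomical class label to a weak textual caption with a trained sequence-to-sequence model and merges the result with the real image-caption pairs. The sketch below uses hypothetical helper names (label_to_caption_model.generate) purely to illustrate the data flow, not the paper's implementation.

```python
def build_augmented_dataset(labelled_images, label_to_caption_model, captioned_pairs):
    """labelled_images: list of (image, class_label); captioned_pairs: real (image, caption) pairs."""
    augmented = list(captioned_pairs)
    for image, label in labelled_images:
        pseudo_caption = label_to_caption_model.generate(label)  # weak label for this image
        augmented.append((image, pseudo_caption))
    return augmented  # used to train the captioning model in a weakly supervised manner
```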

5.
Proc IEEE Int Symp Biomed Imaging; 2021: 716-720, 2021 Apr.
Article in English | MEDLINE | ID: mdl-34413932

ABSTRACT

We propose a curriculum learning method to caption fetal ultrasound images by training a model to dynamically transition between two different modalities (image and text) as training progresses. Specifically, we propose a course-focused dual curriculum method, where a course is training with a curriculum based on only one of the two modalities involved in image captioning. We compare two configurations of the course-focused dual curriculum: an image-first configuration, which prepares the early training batches primarily on the complexity of the image information before slowly introducing an order of batches based on the complexity of the text information, and a text-first configuration, which operates in reverse. The evaluation results show that dynamically transitioning between text and images over epochs of training improves results compared to the scenario where both modalities are considered in equal measure in every epoch.
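A minimal sketch of an image-first course-focused dual curriculum is given below, assuming per-sample difficulty scores for the image and text modalities and a single switch epoch between the two courses; the actual scheduling in the paper may be more gradual.

```python
def dual_curriculum_batches(samples, image_difficulty, text_difficulty,
                            epoch, switch_epoch, batch_size=32):
    """Order training batches by image difficulty early on, by text difficulty later.

    samples: list of (image, caption); *_difficulty: callable sample -> float (lower = easier).
    """
    score = image_difficulty if epoch < switch_epoch else text_difficulty
    ordered = sorted(samples, key=score)                    # easy-to-hard ordering for this course
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```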

6.
Sci Rep; 11(1): 14109, 2021 Jul 8.
Article in English | MEDLINE | ID: mdl-34238950

ABSTRACT

Ultrasound is the primary modality for obstetric imaging and is highly sonographer dependent. A long training period, insufficient recruitment, and poor retention of sonographers are among the global challenges to expanding ultrasound use. For the past several decades, technical advancements in clinical obstetric ultrasound scanning have largely concerned improving image quality and processing speed. By contrast, sonographers have been acquiring ultrasound images in much the same fashion for several decades. The PULSE (Perception Ultrasound by Learning Sonographer Experience) project is an interdisciplinary multi-modal imaging study aiming to offer insights into clinical sonography and transform the process of obstetric ultrasound acquisition and image analysis by applying deep learning to large-scale multi-modal clinical data. A key novelty of the study is that we record full-length ultrasound video with concurrent tracking of the sonographer's eyes, voice, and the transducer while routine obstetric scans are performed on pregnant women. We provide a detailed description of the novel acquisition system and illustrate how our data can be used to describe clinical ultrasound. Being able to measure different sonographer actions or model tasks will lead to a better understanding of several topics, including how to effectively train new sonographers, monitor learning progress, and enhance the scanning workflow of experts.

7.
Med Image Comput Comput Assist Interv; 12263: 534-543, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33103162

ABSTRACT

In medical imaging, manual annotations can be expensive to acquire and sometimes infeasible to access, making conventional deep learning-based models difficult to scale. As a result, it would be beneficial if useful representations could be derived from raw data without the need for manual annotations. In this paper, we propose to address the problem of self-supervised representation learning with multi-modal ultrasound video-speech raw data. For this case, we assume that there is a high correlation between the ultrasound video and the corresponding narrative speech audio of the sonographer. In order to learn meaningful representations, the model needs to identify such correlation and at the same time understand the underlying anatomical features. We designed a framework to model the correspondence between video and audio without any kind of human annotations. Within this framework, we introduce cross-modal contrastive learning and an affinity-aware self-paced learning scheme to enhance correlation modelling. Experimental evaluations on multi-modal fetal ultrasound video and audio show that the proposed approach is able to learn strong representations and transfers well to downstream tasks of standard plane detection and eye-gaze prediction.
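The cross-modal contrastive objective can be illustrated with a symmetric InfoNCE-style loss over aligned video-clip and speech-audio embeddings; the affinity-aware self-paced weighting described in the abstract is omitted from this sketch, and the temperature value is an assumption.

```python
import torch
import torch.nn.functional as F

def cross_modal_contrastive_loss(video_emb, audio_emb, temperature=0.07):
    """video_emb, audio_emb: (B, D) embeddings of temporally aligned video/audio pairs."""
    v = F.normalize(video_emb, dim=-1)
    a = F.normalize(audio_emb, dim=-1)
    logits = v @ a.t() / temperature                       # (B, B) pairwise similarities
    targets = torch.arange(v.size(0), device=v.device)     # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))
```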

8.
Article in English | MEDLINE | ID: mdl-33103165

ABSTRACT

We present a novel curriculum learning approach to train a natural language processing (NLP)-based fetal ultrasound image captioning model. Datasets containing medical images and corresponding textual descriptions are relatively rare and hence smaller than datasets of natural images and their captions. This fact inspired us to develop an approach to train a captioning model suitable for small-sized medical data. Our datasets are prepared using real-world ultrasound video along with synchronised and transcribed sonographer speech recordings. We propose a "dual-curriculum" method for the ultrasound image captioning problem, which relies on building and learning from curricula of image and text information. We compare several distance measures for creating the dual curriculum and observe the best performance using the Wasserstein distance for image information and the tf-idf metric for text information. The evaluation results show an improvement in all performance metrics when using curriculum learning over stochastic mini-batch training, both for the individual task of image classification and when using a dual curriculum for image captioning.
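The two difficulty measures reported to work best can be sketched as follows: the Wasserstein distance between a sample image's intensity distribution and that of a reference image, and the mean tf-idf weight of a caption's terms. The choice of reference image and the per-caption aggregation are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np
from scipy.stats import wasserstein_distance
from sklearn.feature_extraction.text import TfidfVectorizer

def image_difficulty(image, reference_image):
    """Distance between grayscale intensity distributions of a sample and a reference image."""
    return wasserstein_distance(image.ravel(), reference_image.ravel())

def text_difficulty(captions):
    """Score each caption by the mean tf-idf weight of its terms (rarer wording scores higher)."""
    tfidf = TfidfVectorizer().fit_transform(captions)      # (n_captions, vocab_size), sparse
    return np.asarray(tfidf.mean(axis=1)).ravel()
```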

9.
Article in English | MEDLINE | ID: mdl-31976493

ABSTRACT

We describe an automatic natural language processing (NLP)-based image captioning method to describe fetal ultrasound video content by modelling the vocabulary commonly used by sonographers and sonologists. The generated captions are similar to the words spoken by a sonographer when describing the scan experience in terms of visual content and performed scanning actions. Using full-length second-trimester fetal ultrasound videos and text derived from accompanying expert voice-over audio recordings, we train deep learning models consisting of convolutional neural networks and recurrent neural networks in merged configurations to generate captions for ultrasound video frames. We evaluate different model architectures using established general metrics (BLEU, ROUGE-L) and application-specific metrics. Results show that the proposed models can learn joint representations of image and text to generate relevant and descriptive captions for anatomies, such as the spine, the abdomen, the heart, and the head, in clinical fetal ultrasound scans.
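A merged CNN-RNN captioning model of this kind combines a CNN image embedding with an RNN encoding of the partial caption outside the recurrent decoder and predicts the next word from the merged vector. The PyTorch sketch below is illustrative; the layer sizes and single-layer LSTM are assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MergeCaptioner(nn.Module):
    """Merge-style captioner: image and partial-caption encodings are fused before word prediction."""
    def __init__(self, vocab_size, img_feat_dim=2048, embed_dim=256, hidden_dim=256):
        super().__init__()
        self.img_fc = nn.Linear(img_feat_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim * 2, vocab_size)

    def forward(self, img_feats, caption_tokens):           # img_feats: (B, F); tokens: (B, T)
        img = torch.relu(self.img_fc(img_feats))            # (B, H) image branch
        _, (h, _) = self.lstm(self.embed(caption_tokens))   # h[-1]: (B, H) partial-caption summary
        merged = torch.cat([img, h[-1]], dim=-1)             # merge the two branches
        return self.out(merged)                              # next-word logits
```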

10.
Neurology; 86(24): 2264-70, 2016 Jun 14.
Article in English | MEDLINE | ID: mdl-27170570

ABSTRACT

OBJECTIVE: To determine quantitative size thresholds for enlargement of the optic nerve, chiasm, and tract in children with neurofibromatosis type 1 (NF1). METHODS: Children 0.5-18.6 years of age who underwent high-resolution T1-weighted MRI were eligible for inclusion. The cohort consisted of children with NF1 with or without optic pathway gliomas (OPGs) and a control group who did not have other acquired, systemic, or genetic conditions that could alter their anterior visual pathway (AVP). Maximum and average diameter and volume of AVP structures were calculated from reconstructed MRI images. The 95th percentile of the control values was used as the threshold for defining an abnormally large AVP measure. RESULTS: A total of 186 children (controls = 82; NF1noOPG = 54; NF1+OPG = 50) met inclusion criteria. NF1noOPG and NF1+OPG participants demonstrated greater maximum optic nerve diameter and volume, optic chiasm volume, and total brain volume compared to controls (p < 0.05, all comparisons). Total brain volume, rather than age, predicted optic nerve and chiasm volume in controls (p < 0.05). Applying the 95th percentile thresholds for maximum optic nerve diameter (3.9 mm) and AVP volumes to all NF1 participants resulted in few false-positive errors (specificity >80%, all comparisons). CONCLUSIONS: Quantitative reference values for AVP enlargement will enhance the development of objective diagnostic criteria for OPGs secondary to NF1.
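The decision rule is straightforward to reproduce in outline: take the 95th percentile of the control distribution as the enlargement threshold, then count how many unaffected measurements stay at or below it. The values in the sketch below are hypothetical and only illustrate the calculation, not the study's data.

```python
import numpy as np

control_diam = np.array([2.8, 3.1, 3.3, 3.0, 3.5, 3.2, 2.9, 3.4])   # mm, hypothetical controls
threshold = np.percentile(control_diam, 95)                          # enlargement cut-off

nf1_no_opg_diam = np.array([3.0, 3.6, 3.2, 3.1, 3.3])                # hypothetical NF1 without OPG
specificity = np.mean(nf1_no_opg_diam <= threshold)                  # fraction below the cut-off
print(f"threshold = {threshold:.2f} mm, specificity = {specificity:.2f}")
```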


Subject(s)
Magnetic Resonance Imaging, Neurofibromatosis 1/diagnostic imaging, Optic Nerve/diagnostic imaging, Adolescent, Brain/diagnostic imaging, Brain/growth & development, Child, Preschool Child, Female, Humans, Computer-Assisted Image Interpretation/methods, Infant, Magnetic Resonance Imaging/methods, Male, Neurofibromatosis 1/complications, Optic Nerve/growth & development, Optic Nerve Glioma/diagnostic imaging, Optic Nerve Glioma/etiology, Organ Size, Retrospective Studies, Young Adult
11.
IEEE Trans Med Imaging; 35(8): 1856-65, 2016 Aug.
Article in English | MEDLINE | ID: mdl-26930677

ABSTRACT

Analysis of cranial nerve systems, such as the anterior visual pathway (AVP), from MRI sequences is challenging due to their thin, elongated architecture, structural variation along the path, and low contrast with adjacent anatomic structures. Segmentation of a pathologic AVP (e.g., with low-grade gliomas) poses additional challenges. In this work, we propose a fully automated partitioned shape model segmentation mechanism for the AVP steered by multiple MRI sequences and deep learning features. Employing deep learning feature representation, this framework presents a joint partitioned statistical shape model able to deal with healthy and pathological AVP. The deep learning assistance is particularly useful in poor-contrast regions, such as the optic tracts and pathological areas. Our main contributions are: 1) a fast and robust shape localization method using conditional space deep learning, 2) a volumetric multiscale curvelet transform-based intensity normalization method for a robust statistical model, and 3) optimally partitioned statistical shape and appearance models based on regional shape variations for greater local flexibility. Our method was evaluated on MRI sequences obtained from 165 pediatric subjects. A mean Dice similarity coefficient of 0.779 was obtained for segmentation of the entire AVP (optic nerve only = 0.791) using leave-one-out validation. Results demonstrate that the proposed localized shape and sparse appearance-based learning approach significantly outperforms current state-of-the-art segmentation approaches and is as robust as manual segmentation.
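For reference, the Dice similarity coefficient reported above compares a predicted binary mask against a manual one as twice the overlap divided by the sum of the mask sizes; a minimal NumPy sketch:

```python
import numpy as np

def dice_coefficient(pred_mask, gt_mask, eps=1e-8):
    """pred_mask, gt_mask: binary volumes of the same shape."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return 2.0 * intersection / (pred.sum() + gt.sum() + eps)
```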


Subject(s)
Visual Pathways, Humans, Magnetic Resonance Imaging, Statistical Models, Reproducibility of Results